46 research outputs found

    Sentiment Analysis in Spanish

    Ph.D. thesis written by Eugenio Martínez Cámara at the University of Jaén under the supervision of Dr. L. Alfonso Ureña López and Dr. M. Teresa Martín Valdivia. The thesis was defended on 26 October 2015 in Jaén before a panel composed of Dr. María Teresa Taboada Gómez of Simon Fraser University (Canada) as president, Dr. José Manuel Perea Ortega of the University of Extremadura (Spain) as secretary, and Dr. Alexandra Balahur Dobrescu of the Joint Research Centre (Italy) of the European Commission as panel member. The thesis was awarded the grade of Sobresaliente Cum Laude (summa cum laude) and obtained the International mention. This research was partially funded by the European Regional Development Fund (FEDER); the FIRST project (FP7-287607) of the European Commission's Seventh Framework Programme for Research and Technological Development; the ATTOS project (TIN2012-38536-C03-0) of the Spanish Ministry of Economy and Competitiveness; and the AORESCU project (P11-TIC-7684 MO) of the Excellence programme of the Junta de Andalucía.

    A survey on extremism analysis using natural language processing: definitions, literature review, trends and challenges

    Extremism has grown into a global problem for society in recent years, especially after the emergence of movements such as jihadism. This and other extremist groups have taken advantage of different approaches, such as the use of social media, to spread their ideology, promote their acts and recruit followers. The extremist discourse is therefore reflected in the language used by these groups. Natural language processing (NLP) provides a way of detecting this type of content, and several authors use it to describe and discriminate the discourse held by these groups, with the final objective of detecting and preventing its spread. Following this approach, this survey reviews the contributions of NLP to the field of extremism research, providing the reader with a comprehensive picture of the state of the art of this research area. The content includes a first conceptualization of the term extremism, the elements that compose an extremist discourse, and the differences from other terms. After that, a review, description and comparison of the frequently used NLP techniques is presented, including how they were applied, the insights they provided, the most frequently used NLP software tools, descriptive and classification applications, and the availability of datasets and data sources for research. Finally, research questions are approached and answered with highlights from the review, while future trends, challenges and directions derived from these highlights are suggested in order to stimulate further research in this area. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

    CRiSOL: Opinion Knowledge-base for Spanish

    In this paper we focus on Spanish polarity classification in a corpus of hotel reviews (COAH) and introduce a new lexical resource called CRiSOL. This new resource is built on the list of Spanish opinion words iSOL: CRiSOL appends to each word of iSOL the polarity value of the related synset of SentiWordNet. Because SentiWordNet is not a Spanish linguistic resource, the Spanish version of WordNet included in the Multilingual Central Repository (MCR) had to be used as a pivot. An unsupervised polarity classifier has been developed to assess the validity of CRiSOL. The results reached with CRiSOL are higher than those reached with the base lexicons iSOL and SentiWordNet separately, which encourages us to continue this line of research. This research has been partially funded by the European Regional Development Fund (FEDER), the ATTOS project (TIN2012-38536-C03-0) of the Spanish Government and the AORESCU project (P11-TIC-7684 MO) of the regional government of the Junta de Andalucía. Finally, the CEATIC project (CEATIC-2013-01) of the University of Jaén has also partially funded this paper.
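    As an illustration of the idea in the abstract above, the following is a minimal sketch of a lexicon-based unsupervised polarity classifier. It is not the authors' implementation: the abstract does not say how the iSOL polarity and the SentiWordNet value are combined, so the additive score, the function names and every lexicon entry and value below are assumptions made for illustration only.

```python
# Toy stand-ins for the real resources; every word and value is hypothetical.
ISOL_POLARITY = {"excelente": 1, "limpio": 1, "sucio": -1, "horrible": -1}
SYNSET_SCORE = {"excelente": 0.75, "limpio": 0.5, "sucio": -0.5, "horrible": -0.75}


def build_combined_lexicon(word_polarity, synset_score):
    """Attach a synset-level score to each opinion word (0.0 if none is linked)."""
    return {
        word: (polarity, synset_score.get(word, 0.0))
        for word, polarity in word_polarity.items()
    }


def classify(text, lexicon):
    """Sum the word-level and synset-level evidence and return a polarity label."""
    score = 0.0
    for token in text.lower().split():
        if token in lexicon:
            polarity, strength = lexicon[token]
            # Assumed combination rule: simple addition of both signals.
            score += polarity + strength
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    lexicon = build_combined_lexicon(ISOL_POLARITY, SYNSET_SCORE)
    print(classify("Hotel excelente y muy limpio", lexicon))
```

    The classifier needs no labelled training data, which is what makes it unsupervised; tokenization here is a plain whitespace split and would need replacing for real review text.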

    Explainable Crowd Decision Making methodology guided by expert natural language opinions based on Sentiment Analysis with Attention-based Deep Learning and Subgroup Discovery

    There is high demand for explainability in artificial intelligence systems, decision making models included. This paper focuses on crowd decision making using natural language evaluations from social media, with the aim of providing explainability. We present the Explainable Crowd Decision Making based on Subgroup Discovery and Attention Mechanisms (ECDM-SDAM) methodology as an a posteriori explainable process that captures the wisdom of crowds naturally provided in social media opinions. It extracts the opinions from social media texts using a deep learning based sentiment analysis approach called Attention based Sentiment Analysis Method. The methodology includes a backward process that provides explanations to justify its sense-making procedure, mainly by applying the attention mechanism to texts and subgroup discovery to opinions. We evaluate the methodology on the real case study of the TripR-2020Large dataset for restaurant choice. The results show that the ECDM-SDAM methodology provides easily understandable explanations that elucidate the key reasons supporting the output of the decision process. Funding: PID2020-119478GBI00; PID2019-103880RB-I00; PID2020-116118GA-I00; MCIN/AEI/10.13039/501100011033; ERDF A way of making Europe; PRE2018-083884 funded by MCIN/AEI/10.13039/501100011033; ESF Investing in your future; Universidad de Granada / CBU

    TASS 2015 – La evolución de los sistemas de análisis de opiniones para español

    Sentiment analysis in microblogging remains a topical task that makes it possible to track the polarity of the opinions published minute by minute on social media. TASS is an evaluation workshop whose goal is to promote research on and development of new algorithms, resources and techniques for sentiment analysis in Spanish. This paper describes the fourth edition of TASS, summarizing the main contributions of the submitted systems, analyzing the results and showing how they have evolved. In addition to a brief description of the participating systems, it presents a new corpus of labelled tweets in the political domain, compiled for the Sentiment Analysis at Aspect Level task. This work has been partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER), the REDES project (TIN2015-65136-C2-1-R) and Ciudad2020 (INNPRONTA IPT-20111006) from the Spanish Government.

    Dynamic Defense Against Byzantine Poisoning Attacks in Federated Learning

    Federated learning, as a distributed learning paradigm that conducts training on local devices without accessing the training data, is vulnerable to Byzantine poisoning adversarial attacks. We argue that the federated learning model has to avoid this kind of adversarial attack by filtering out the adversarial clients by means of the federated aggregation operator. We propose a dynamic federated aggregation operator that dynamically discards those adversarial clients and prevents the corruption of the global learning model. We assess it as a defense against adversarial attacks by deploying a deep learning classification model in a federated learning setting on the Fed-EMNIST Digits, Fashion MNIST and CIFAR-10 image datasets. The results show that the dynamic selection of the clients to aggregate enhances the performance of the global learning model and discards the adversarial and poor (with low quality models) clients. Funding: R&D&I grants, MCIN/AEI, Spain: PID-2020-119478GB-I00, PID2020-116118GA-I00, EQC2018-005-084-P; ERDF A way of making Europe; MCIN/AEI FPU18/04475, IJC2018-036092-
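    The general idea of an aggregation step that discards suspicious clients before averaging can be sketched as follows. This is not the operator proposed in the paper, whose details the abstract does not give. The sketch assumes that each client update comes with a hypothetical quality score (for example, accuracy on a held-out validation set) and that clients scoring more than `tolerance` below the median are dropped; both the scoring scheme and the threshold rule are illustrative assumptions.

```python
from statistics import median


def filtered_average(client_weights, client_scores, tolerance=0.2):
    """Average only the client updates whose score is close to the median.

    client_weights: list of equally sized lists of floats (one per client).
    client_scores:  hypothetical per-client quality scores, higher is better.
    tolerance:      assumed non-negative margin below the median still accepted.
    """
    cutoff = median(client_scores) - tolerance
    kept = [w for w, s in zip(client_weights, client_scores) if s >= cutoff]
    # With tolerance >= 0 the best-scoring client always passes, so kept is non-empty.
    return [sum(values) / len(kept) for values in zip(*kept)]


if __name__ == "__main__":
    weights = [[1.0, 2.0], [3.0, 4.0], [100.0, -100.0]]  # last client is poisoned
    scores = [0.9, 0.8, 0.1]
    print(filtered_average(weights, scores))
```

    Because the cutoff is recomputed from the scores in every round, the set of discarded clients can change from round to round, which is the sense in which such a filter is dynamic.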

    Survey on Federated Learning Threats: concepts, taxonomy on attacks and defences, experimental study and challenges

    Federated learning is a machine learning paradigm that emerges as a solution to the privacy-preservation demands in artificial intelligence. Like machine learning in general, federated learning is threatened by adversarial attacks against the integrity of the learning model and the privacy of data, owing to its distributed approach to local and global learning. This weak point is exacerbated by the inaccessibility of data in federated learning, which makes protection against adversarial attacks harder and shows the need for further research on defence methods to make federated learning a real solution for safeguarding data privacy. In this paper, we present an extensive review of the threats to federated learning, as well as their corresponding countermeasures: attacks versus defences. This survey provides a taxonomy of adversarial attacks and a taxonomy of defence methods that depict a general picture of this vulnerability of federated learning and how to overcome it. Likewise, we set out guidelines for selecting the most adequate defence method according to the category of the adversarial attack. In addition, we carry out an extensive experimental study from which we draw further conclusions about the behaviour of attacks and defences and about those selection guidelines. The study concludes with carefully considered lessons learned and open challenges.

    Negation Scope Identification in Spanish Reviews

    Sentiment analysis is a task that still has many open challenges before it can be considered solved. One of them is the treatment of negation, because a negative opinion can be expressed using negated positive terms. Negation is a language-specific feature, so its treatment must be adapted to the particularities of each language. This article presents a linguistic approach to identifying the scope of negation in Spanish, which has been applied in a polarity classification system for movie reviews. This work has been partially funded by the European Regional Development Fund (FEDER), the ATTOS project (TIN2012-38536-C03-0) of the Spanish Government, the AORESCU project (TIC-07684) of the regional government of the Junta de Andalucía and the CEATIC-2013-01 project of the University of Jaén.
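    To show why negation scope matters for polarity classification, here is a minimal sketch that flips the polarity of opinion words falling inside the scope of a negation cue. The article describes a linguistic approach to delimiting that scope; this sketch deliberately substitutes a much simpler fixed-window heuristic (the next `window` tokens after the cue), and the cue list, the lexicon and the window size are all hypothetical.

```python
# Hypothetical cue list and toy lexicon, for illustration only.
NEGATION_CUES = {"no", "nunca", "jamás", "sin"}
TOY_LEXICON = {"buena": 1, "genial": 1, "mala": -1, "aburrida": -1}


def score_with_negation(tokens, window=3):
    """Sum word polarities, inverting those within `window` tokens after a cue."""
    score = 0
    remaining = 0  # tokens still inside the current negation scope
    for token in tokens:
        if token in NEGATION_CUES:
            remaining = window
            continue
        polarity = TOY_LEXICON.get(token, 0)
        if remaining > 0:
            polarity = -polarity
            remaining -= 1
        score += polarity
    return score


if __name__ == "__main__":
    print(score_with_negation("la película no es buena".split()))
    print(score_with_negation("la película es buena".split()))
```

    A fixed window both over-reaches and under-reaches in practice (it ignores clause boundaries, for instance), which is the kind of limitation a syntax-aware scope identification is meant to address.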